34 research outputs found
Semi-generative modelling: learning with cause and effect features
We consider a case of covariate shift where prior causal inference or expert knowledge has identified some features as effects, and show how this setting, when analysed from a causal perspective, gives rise to a semi-generative modelling framework: P(Y, X_eff | X_cau).
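The factorisation P(Y, X_eff | X_cau) = P(Y | X_cau) P(X_eff | Y, X_cau) can be illustrated with a toy sketch: a discriminative model for the target given its causes, combined via Bayes' rule with a generative model for the effect features. The data-generating process, variable names, and unit-variance assumption below are illustrative choices, not taken from the paper.

```python
# Semi-generative toy model: disease Y from a risk factor (cause) and a
# symptom (effect). All numbers and names here are hypothetical.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 5000
x_cau = rng.normal(size=n)                       # causal feature (risk factor)
p_y = 1 / (1 + np.exp(-2 * x_cau))               # P(Y=1 | X_cau)
y = rng.binomial(1, p_y)
x_eff = np.where(y == 1, 2.0, -2.0) + rng.normal(size=n)  # effect (symptom)

# Discriminative part: P(Y | X_cau), here the true logistic form.
def p_y_given_cause(xc):
    return 1 / (1 + np.exp(-2 * xc))

# Generative part: P(X_eff | Y) as Gaussians with means fit from data
# (unit variance assumed, matching the simulation above).
mu1, mu0 = x_eff[y == 1].mean(), x_eff[y == 0].mean()

def predict(xc, xe):
    """Combine both parts via Bayes' rule: P(Y=1 | X_cau, X_eff)."""
    p1 = p_y_given_cause(xc) * norm.pdf(xe, mu1, 1.0)
    p0 = (1 - p_y_given_cause(xc)) * norm.pdf(xe, mu0, 1.0)
    return p1 / (p1 + p0)

acc_semi = np.mean((predict(x_cau, x_eff) > 0.5) == y)
acc_cause_only = np.mean((p_y_given_cause(x_cau) > 0.5) == y)
print(acc_semi, acc_cause_only)
```

In this simulation the combined model outperforms the cause-only classifier because the symptom carries additional information about Y.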
Semi-Supervised Learning, Causality and the Conditional Cluster Assumption
While the success of semi-supervised learning (SSL) is still not fully
understood, Sch\"olkopf et al. (2012) have established a link to the principle
of independent causal mechanisms. They conclude that SSL should be impossible
when predicting a target variable from its causes, but possible when predicting
it from its effects. Since both these cases are somewhat restrictive, we extend
their work by considering classification using cause and effect features at the
same time, such as predicting disease from both risk factors and symptoms.
While standard SSL exploits information contained in the marginal distribution
of all inputs (to improve the estimate of the conditional distribution of the
target given inputs), we argue that in our more general setting we should use
information in the conditional distribution of effect features given causal
features. We explore how this insight generalises the previous understanding,
and how it relates to and can be exploited algorithmically for SSL.
Comment: 36th Conference on Uncertainty in Artificial Intelligence (2020).
(Previously presented at the NeurIPS 2019 workshop "Do the right thing":
machine learning and causal inference for improved decision making,
Vancouver, Canada.)
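The key observation, that unlabeled data is informative about P(X_eff | X_cau), which is a mixture over Y of the class conditionals, can be sketched in a toy setting: a Gaussian mixture fit on unlabeled effect features recovers the class-conditional means that only a handful of labels would otherwise have to estimate. The simulation parameters below are illustrative, not the paper's experiments.

```python
# Toy sketch: unlabeled effect features inform the class conditionals.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)

# Tiny labeled set: five examples per class (true means -1.5 and +1.5).
y_lab = np.repeat([0, 1], 5)
xe_lab = np.where(y_lab == 1, 1.5, -1.5) + rng.normal(size=10)

# Large unlabeled set of effect features (labels hidden).
y_un = rng.binomial(1, 0.5, size=5000)
xe_unlab = np.where(y_un == 1, 1.5, -1.5) + rng.normal(size=5000)

# Supervised estimate of class-conditional means from labels alone.
mu_sup = np.sort([xe_lab[y_lab == 0].mean(), xe_lab[y_lab == 1].mean()])

# Semi-supervised estimate: two-component mixture fit on unlabeled data.
gmm = GaussianMixture(n_components=2, random_state=0).fit(xe_unlab.reshape(-1, 1))
mu_ssl = np.sort(gmm.means_.ravel())

err_sup = np.abs(mu_sup - np.array([-1.5, 1.5])).max()
err_ssl = np.abs(mu_ssl - np.array([-1.5, 1.5])).max()
print(err_sup, err_ssl)
```

With 5000 unlabeled points the mixture means are typically far more accurate than the ten-label estimates, which is the sense in which the marginal (here, conditional-on-cause) distribution helps the supervised task.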
Backtracking Counterfactuals
Counterfactual reasoning -- envisioning hypothetical scenarios, or possible
worlds, where some circumstances are different from what (f)actually occurred
(counter-to-fact) -- is ubiquitous in human cognition. Conventionally,
counterfactually-altered circumstances have been treated as "small miracles"
that locally violate the laws of nature while sharing the same initial
conditions. In Pearl's structural causal model (SCM) framework this is made
mathematically rigorous via interventions that modify the causal laws while the
values of exogenous variables are shared. In recent years, however, this purely
interventionist account of counterfactuals has increasingly come under scrutiny
from both philosophers and psychologists. Instead, they suggest a backtracking
account of counterfactuals, according to which the causal laws remain unchanged
in the counterfactual world; differences to the factual world are instead
"backtracked" to altered initial conditions (exogenous variables). In the
present work, we explore and formalise this alternative mode of counterfactual
reasoning within the SCM framework. Despite ample evidence that humans
backtrack, the present work constitutes, to the best of our knowledge, the
first general account and algorithmisation of backtracking counterfactuals. We
discuss our backtracking semantics in the context of related literature and
draw connections to recent developments in explainable artificial intelligence
(XAI).
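The contrast between the two semantics can be made concrete in a two-variable linear SCM. The SCM and the squared-distance closeness measure below are our own illustrative choices, not the paper's exact formalisation: X := U_x, Y := X + U_y, with factual exogenous values U_x = U_y = 1.

```python
# Interventional vs. backtracking counterfactuals in a minimal SCM:
#   X := U_x,   Y := X + U_y   (illustrative example, not from the paper).
u_x, u_y = 1.0, 1.0           # factual exogenous values
x_f, y_f = u_x, u_x + u_y     # factual world: X = 1, Y = 2

y_star = 5.0                  # counterfactual antecedent: "had Y been 5"

# Interventional (Pearl) semantics: replace the mechanism for Y by Y := 5,
# keep the exogenous variables fixed. The cause X is unchanged.
x_int, y_int = u_x, y_star

# Backtracking semantics: keep all causal laws; instead find exogenous
# values (u_x', u_y') closest to the factual ones (squared distance) such
# that the laws yield Y = 5, i.e. u_x' + u_y' = 5.
# Closed form: orthogonal projection of (u_x, u_y) onto that line.
shift = (y_star - (u_x + u_y)) / 2.0
u_x_b, u_y_b = u_x + shift, u_y + shift
x_bt, y_bt = u_x_b, u_x_b + u_y_b

print(x_int, x_bt)            # the cause X changes only under backtracking
```

The difference is visible in the cause: intervening on Y leaves X at its factual value 1, whereas backtracking attributes part of the changed outcome to altered initial conditions, moving X to 2.5.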
Causal Effect Estimation from Observational and Interventional Data Through Matrix Weighted Linear Estimators
We study causal effect estimation from a mixture of observational and
interventional data in a confounded linear regression model with multivariate
treatments. We show that the statistical efficiency in terms of expected
squared error can be improved by combining estimators arising from both the
observational and interventional setting. To this end, we derive methods based
on matrix weighted linear estimators and prove that our methods are
asymptotically unbiased in the infinite sample limit. This is an important
improvement compared to the pooled estimator using the union of interventional
and observational data, for which the bias only vanishes if the ratio of
observational to interventional data tends to zero. Studies on synthetic data
confirm our theoretical findings. In settings where confounding is substantial
and the ratio of observational to interventional data is large, our estimators
outperform a Stein-type estimator and various other baselines.
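A simplified scalar analogue (not the paper's matrix-weighted estimator) shows why a weighted combination can beat pooling: the observational slope is biased by confounding, the interventional slope is unbiased but noisy, and weighting each by an estimate of its inverse mean squared error, with the observational bias proxied by the discrepancy between the two, downweights the biased estimator. All model parameters below are illustrative.

```python
# Toy scalar illustration of combining observational and interventional
# OLS estimates (simplified analogue, not the paper's method).
import numpy as np

rng = np.random.default_rng(0)
beta = 1.0

# Observational regime: hidden confounder U drives both X and Y.
n_obs = 2000
u = rng.normal(size=n_obs)
x_obs = u + rng.normal(size=n_obs)
y_obs = beta * x_obs + 2.0 * u + rng.normal(size=n_obs)

# Interventional regime: X randomised, so U no longer confounds the slope.
n_int = 200
x_int = rng.normal(size=n_int)
y_int = beta * x_int + 2.0 * rng.normal(size=n_int) + rng.normal(size=n_int)

ols = lambda x, y: (x @ y) / (x @ x)
b_obs = ols(x_obs, y_obs)                                 # biased
b_int = ols(x_int, y_int)                                 # unbiased, noisier
b_pool = ols(np.r_[x_obs, x_int], np.r_[y_obs, y_int])    # pooled: still biased

# Inverse-MSE weights; observational bias proxied by b_obs - b_int.
var_int = np.var(y_int - b_int * x_int) / (x_int @ x_int)
var_obs = np.var(y_obs - b_obs * x_obs) / (x_obs @ x_obs)
w_int = 1.0 / var_int
w_obs = 1.0 / (var_obs + (b_obs - b_int) ** 2)
b_comb = (w_int * b_int + w_obs * b_obs) / (w_int + w_obs)
print(b_obs, b_int, b_pool, b_comb)
```

As in the paper's asymptotic result, the pooled estimator stays biased when observational data dominates, while the weighted combination concentrates near the true slope.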
Kernel Two-Sample and Independence Tests for Non-Stationary Random Processes
Two-sample and independence tests with the kernel-based MMD and HSIC have
shown remarkable results on i.i.d. data and stationary random processes.
However, these statistics are not directly applicable to non-stationary random
processes, a prevalent form of data in many scientific disciplines. In this
work, we extend the application of MMD and HSIC to non-stationary settings by
assuming access to independent realisations of the underlying random process.
These realisations - in the form of non-stationary time-series measured on the
same temporal grid - can then be viewed as i.i.d. samples from a multivariate
probability distribution, to which MMD and HSIC can be applied. We further show
how to choose suitable kernels over these high-dimensional spaces by maximising
the estimated test power with respect to the kernel hyper-parameters. In
experiments on synthetic data, we demonstrate superior performance of our
proposed approaches in terms of test power when compared to current
state-of-the-art functional or multivariate two-sample and independence tests.
Finally, we employ our methods on a real socio-economic dataset as an example
application.
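The reduction described above, treating independent realisations measured on a common time grid as i.i.d. vectors so that a standard kernel two-sample test applies, can be sketched as follows. The processes, the RBF kernel, and the median-heuristic bandwidth are illustrative defaults, not the paper's power-maximised kernel choice.

```python
# MMD permutation test on independent realisations of non-stationary
# processes, each realisation treated as one high-dimensional sample.
import numpy as np

rng = np.random.default_rng(0)

def realisations(n, drift, t=50):
    """n independent random-walk realisations with a deterministic drift."""
    return np.cumsum(rng.normal(size=(n, t)) + drift, axis=1)

X = realisations(40, drift=0.0)    # sample from process P
Y = realisations(40, drift=0.3)    # sample from process Q

def mmd2(X, Y, gamma):
    """Biased MMD^2 estimate with an RBF kernel."""
    Z = np.vstack([X, Y])
    sq = np.sum(Z**2, 1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * Z @ Z.T))
    n = len(X)
    return K[:n, :n].mean() + K[n:, n:].mean() - 2 * K[:n, n:].mean()

# Median heuristic for the bandwidth (a common default).
Z = np.vstack([X, Y])
d2 = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, -1)
gamma = 1.0 / np.median(d2[d2 > 0])

stat = mmd2(X, Y, gamma)
# Permutation null: reshuffle realisations across the two samples.
perm_stats = []
Zc = Z.copy()
for _ in range(200):
    rng.shuffle(Zc)
    perm_stats.append(mmd2(Zc[:40], Zc[40:], gamma))
p_value = np.mean(np.array(perm_stats) >= stat)
print(stat, p_value)
```

Because each realisation is an independent draw, permuting whole realisations (never time points within one) gives a valid null distribution despite the non-stationarity.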
Kernel-based independence tests for causal structure learning on functional data
Measurements of systems taken along a continuous functional dimension, such
as time or space, are ubiquitous in many fields, from the physical and
biological sciences to economics and engineering. Such measurements can be
viewed as realisations of an underlying smooth process sampled over the
continuum. However, traditional methods for independence testing and causal
learning are not directly applicable to such data, as they do not take into
account the dependence along the functional dimension. By using specifically
designed kernels, we introduce statistical tests for bivariate, joint, and
conditional independence for functional variables. Our method not only extends
the applicability of the HSIC and its d-variate version (d-HSIC) to functional
data, but also allows us to introduce a test for conditional independence by
defining a novel statistic for the conditional permutation test (CPT) based on
the HSCIC, with optimised regularisation strength estimated through an
evaluation rejection rate. Our empirical results on the size and power of
these tests on synthetic functional data show good performance, and we then
exemplify their application to several
constraint- and regression-based causal structure learning problems, including
both synthetic examples and real socio-economic data.
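The bivariate case can be sketched with an HSIC permutation test on paired curves sampled on a fixed grid, each curve treated as a vector. The kernels (RBF with median-heuristic bandwidth) and the data are illustrative, not the paper's specifically designed functional kernels.

```python
# HSIC independence test for paired functional samples (illustrative sketch).
import numpy as np

rng = np.random.default_rng(0)
n, t = 60, 30

# Paired curves: Y is a noisy pointwise transformation of X -> dependent.
X = np.cumsum(rng.normal(size=(n, t)), axis=1)
Y = np.sin(X) + 0.1 * rng.normal(size=(n, t))

def rbf(A, gamma):
    sq = np.sum(A**2, 1)
    return np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * A @ A.T))

def med_gamma(A):
    d2 = np.sum((A[:, None] - A[None, :]) ** 2, -1)
    return 1.0 / np.median(d2[d2 > 0])

def hsic(K, L):
    """Biased HSIC estimate via centred Gram matrices."""
    m = len(K)
    H = np.eye(m) - np.ones((m, m)) / m
    return np.trace(H @ K @ H @ L) / (m - 1) ** 2

K, L = rbf(X, med_gamma(X)), rbf(Y, med_gamma(Y))
stat = hsic(K, L)

# Permutation null: break the pairing by permuting one sample's curves.
null = []
for _ in range(200):
    p = rng.permutation(n)
    null.append(hsic(K, L[np.ix_(p, p)]))
p_value = np.mean(np.array(null) >= stat)
print(stat, p_value)
```

Permuting whole curves (rather than individual grid points) respects the dependence along the functional dimension, which is exactly what pointwise tests get wrong.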